
Revamped tuning #130

Merged: 86 commits from revamped-tuning into main on Jan 23, 2025
Conversation

@GardevoirX GardevoirX (Contributor) commented Dec 17, 2024

This PR introduces two things:

More work still needs to be done before this PR is ready, such as writing documentation and fixing the pytests and the example.

Contributor (creator of pull-request) checklist

  • Tests updated (for new features and bugfixes)?
  • Documentation updated (for new features)?
  • Issue referenced (for PRs that solve an issue)?

Reviewer checklist

  • CHANGELOG updated with public API or any other important changes?

📚 Documentation preview 📚: https://torch-pme--130.org.readthedocs.build/en/130/

@GardevoirX GardevoirX linked an issue Dec 17, 2024 that may be closed by this pull request
Comment on lines 22 to 24
def grid_search(
method: str,
charges: torch.Tensor,
Contributor:

I would turn the logic around and keep the tune_XXX methods. Also, grid_search is a very common name; it is not really clear from it that this will find the optimal parameters for the methods.

@@ -515,3 +518,82 @@ def forward(self, positions, cell, charges):
print(f"Evaluation time:\nPytorch: {time_python}ms\nJitted: {time_jit}ms")

# %%
# Other auto-differentiation ideas
Contributor:

IMHO I wouldn't put this example here, even though I think it is good to have. The tutorial is already 500 lines and with this becomes super long. I'd rather vote for smaller examples, each tackling one specific task. Finding solutions is much easier if examples are shorter. See also the beloved matplotlib examples.

@GardevoirX GardevoirX marked this pull request as ready for review January 7, 2025 12:57
@PicoCentauri PicoCentauri (Contributor) left a comment:

Why are there three tuning base classes? TuningErrorBounds and TuningTimings are hardcoded inside GridSearchBase. I think a single base class is enough, no?

src/torchpme/utils/tuning/__init__.py — resolved (outdated)
CalculatorClass = P3MCalculator
GridSearchParams = {
"interpolation_nodes": [2, 3, 4, 5],
"mesh_spacing": 1 / ((np.exp2(np.arange(2, 8)) - 1) / 2),
Contributor:

Shouldn't we give users the option to choose the grid points on which they want to optimize?

@GardevoirX GardevoirX (author):

This is mainly because the possible grid points were hard-coded before. If we can do this, it would be good. Should we let the user input a list of their desired mesh_spacing values at the beginning?
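A rough sketch of what that could look like; the make_grid_search_params helper and the mesh_spacing_grid parameter are hypothetical, not part of this PR:

import numpy as np

# Default grid, mirroring the values currently hard-coded for P3M
DEFAULT_MESH_SPACING_GRID = 1 / ((np.exp2(np.arange(2, 8)) - 1) / 2)

def make_grid_search_params(mesh_spacing_grid=None, interpolation_nodes=(2, 3, 4, 5)):
    """Build the search grid, falling back to the current defaults."""
    if mesh_spacing_grid is None:
        mesh_spacing_grid = DEFAULT_MESH_SPACING_GRID
    return {
        "interpolation_nodes": list(interpolation_nodes),
        "mesh_spacing": list(mesh_spacing_grid),
    }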

)
value = result.sum()
if self._run_backward:
value.backward(retain_graph=True)
Contributor:

Why do you need to retain the graph here?
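For context: in PyTorch, calling backward() a second time through the same graph raises a RuntimeError unless the earlier call used retain_graph=True, which matters when a single forward result is differentiated repeatedly, e.g. in a timing loop. A minimal illustration:

import torch

x = torch.ones(3, requires_grad=True)
y = (x * x).sum()
y.backward(retain_graph=True)  # keep the graph alive for a second pass
y.backward()                   # works; would raise a RuntimeError without retain_graph above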

positions.requires_grad_(True)
cell.requires_grad_(True)
charges.requires_grad_(True)
execution_time -= time.time()
Contributor:

This looks like a very weird way of storing the result. Why not use a temp variable?

Suggested change
execution_time -= time.time()
t0 = time.time()

See below.


if self._device == torch.device("cuda"):
torch.cuda.synchronize()
execution_time += time.time()
Contributor:

Suggested change
execution_time += time.time()
execution_time += time.time() - t0

Comment on lines 142 to 149
self._charges = charges
self._cell = cell
self._positions = positions
self._dtype = charges.dtype
self._device = charges.device
self._n_repeat = n_repeat
self._n_warmup = n_warmup
self._run_backward = run_backward
Contributor:

Do you really need all of these private properties? Many of them seem to be used only once and are hardcoded.

Also, I think user-provided variables should be stored publicly: if I pass positions, I should be able to access them via self.positions and not as a private property.
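A sketch of the suggested public-attribute style; the defaults shown here are hypothetical:

class TuningTimings:
    def __init__(self, charges, cell, positions, n_repeat=4, n_warmup=1, run_backward=False):
        # User-provided inputs stay public, as suggested above
        self.charges = charges
        self.cell = cell
        self.positions = positions
        self.n_repeat = n_repeat
        self.n_warmup = n_warmup
        self.run_backward = run_backward
        # dtype and device are read off the inputs instead of being stored privately
        self.dtype = charges.dtype
        self.device = charges.device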

src/torchpme/utils/tuning/__init__.py — resolved (outdated)
examples/10-tuning.py — resolved
@GardevoirX GardevoirX force-pushed the revamped-tuning branch 2 times, most recently from a41f780 to 33c9705 on January 7, 2025 22:18
@PicoCentauri PicoCentauri (Contributor):

Some notes from our meeting (see the sketch after this list):

  • give a List[Dict] and the calculator class to the init of the tuner class, where each dict holds the named or unnamed parameters passed to the init of the calculator class
  • the class returns a List[float], where the floats are the obtained timings
  • the user-facing function, i.e. tune_ewald, creates the grid and returns the best parameters
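A minimal sketch of that interface; the forward_fn callback, the grid arguments, and the selection criterion are illustrative, not the final API:

import time

class Tuner:
    """Time one calculator class over a list of parameter sets."""

    def __init__(self, calculator_class, params: list[dict]):
        self.calculator_class = calculator_class
        self.params = params  # each dict holds kwargs for the calculator's __init__

    def run(self, forward_fn) -> list[float]:
        # forward_fn(calculator) should execute one forward pass on the user's system
        timings = []
        for kwargs in self.params:
            calculator = self.calculator_class(**kwargs)
            t0 = time.monotonic()
            forward_fn(calculator)
            timings.append(time.monotonic() - t0)
        return timings

def tune_ewald(forward_fn, calculator_class, smearing_grid, lr_wavelength_grid):
    # User-facing sketch: build the grid, time each candidate, return the fastest
    params = [
        {"smearing": s, "lr_wavelength": lw}
        for s in smearing_grid
        for lw in lr_wavelength_grid
    ]
    timings = Tuner(calculator_class, params).run(forward_fn)
    return params[min(range(len(params)), key=timings.__getitem__)]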

@PicoCentauri PicoCentauri (Contributor) left a comment:

Update docstrings and tests.

Write an API doc page explaining the base class and how we do the tuning (reuse the text for updating the paper). In the API references I would add a new tuning section. On the tuning page I would explain how we do the tuning, then create one page for each calculator and finally one page for the base classes. On the base-class page, explain how you designed these classes and how they work together.

The subpages for each calculator should first display the tuning function and, below it, the classes for the error bounds. In the introduction text of each, display the equation for the error bounds.

positions: torch.Tensor,
cutoff: float,
calculator,
params: list[dict],
Contributor:

I think you don't need the exponent; you should be able to extract it from the calculator.

@GardevoirX GardevoirX (author):

But the Potential does not necessarily have the attribute exponent, like CoulombPotential 🤔
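One possible compromise, sketched here rather than taken from the PR: read the attribute with a fallback, assuming plain Coulomb (exponent 1) when it is absent:

def _get_exponent(potential, default: int = 1) -> int:
    # CoulombPotential carries no `exponent` attribute, so fall back to 1/r behaviour
    return getattr(potential, "exponent", default)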

src/torchpme/tuning/base.py — resolved (outdated)
Comment on lines 27 to 29
self._dtype = cell.dtype
self._device = cell.device

Contributor:

Should we put the dtype here as an argument or deduce it from the calculator?

What do you say, @E-Rum?

Comment on lines 49 to 50
@staticmethod
def _validate_parameters(
Contributor:

This is now very similar to the one we use in the calculators, right?

Maybe we extract and merge both and make them a standalone private function living in utils.

@GardevoirX GardevoirX (author):

They are still slightly different from each other: the one in the calculators checks smearing while the one in tuning checks exponent, but it should be possible to extract the common part.
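A sketch of how the common part could be factored out; the function names and the exact checks are hypothetical:

import torch

def _validate_system(charges: torch.Tensor, cell: torch.Tensor, positions: torch.Tensor) -> None:
    """Shared checks that both the calculators and the tuners could reuse."""
    if positions.ndim != 2 or positions.shape[1] != 3:
        raise ValueError(f"`positions` must be an (N, 3) tensor, got {tuple(positions.shape)}")
    if cell.shape != (3, 3):
        raise ValueError(f"`cell` must be a (3, 3) tensor, got {tuple(cell.shape)}")
    if charges.device != positions.device or cell.device != positions.device:
        raise ValueError("`charges`, `cell` and `positions` must be on the same device")

def _validate_tuning_parameters(charges, cell, positions, exponent):
    _validate_system(charges, cell, positions)
    # Tuning-specific check (assumed): the error bounds are derived for exponent 1 only
    if exponent != 1:
        raise ValueError(f"Only exponent = 1 is supported, got {exponent}")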

Contributor:

Okay, yeah, it might be useful to have some code sharing if possible.

@GardevoirX GardevoirX (author):

After getting the standalone function, do we only call it in the tuning functions, or do we still keep calling it during the initialization of the tuner?

@PicoCentauri PicoCentauri (Contributor) left a comment:

Yes, I am happy with the design. I left some initial comments, but we can start making the code ready to go in.

docs/src/references/index.rst — resolved (outdated)
docs/src/references/tuning/base_classes.rst — resolved
docs/src/references/tuning/tune_ewald.rst — resolved
src/torchpme/tuning/error_bounds.py — resolved (outdated)
@PicoCentauri PicoCentauri (Contributor) left a comment:

Very good progress. The overall structure looks convincing to me. I have one major question about the importance of the backward pass in the tuning.

If you can convince me that this is important, we can keep it. Otherwise, I suggest timing only the forward pass.

Contributor:

This example lacks explanations. Linking to the error formulas might be useful, plus some more text between the cells explaining the last cells and the plans for the upcoming code.

Comment on lines 43 to 45
assert isinstance(
potential, Potential
), f"Potential must be an instance of Potential, got {type(potential)}"
Contributor:

I would rather raise a ValueError here. Asserts are usually for testing; if code should check something under all circumstances, one should use a "real" error. See for example:

https://stackoverflow.com/questions/40182944/whats-the-difference-between-raise-try-and-assert

@GardevoirX GardevoirX (author):

I wonder if TypeError is better? And what about those assertions below these lines?

assert self.dtype == self.potential.dtype, (
f"Potential and Calculator must have the same dtype, got {self.dtype} and "
f"{self.potential.dtype}"
)
assert self.device == self.potential.device, (
f"Potential and Calculator must have the same device, got {self.device} and "
f"{self.potential.device}"
)
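For illustration, the same checks rewritten with explicit raises, using TypeError for the wrong class and ValueError for mismatched metadata; this is one reading of the discussion, not the merged code:

if not isinstance(potential, Potential):
    raise TypeError(f"Potential must be an instance of Potential, got {type(potential)}")
if self.dtype != self.potential.dtype:
    raise ValueError(
        f"Potential and Calculator must have the same dtype, got {self.dtype} "
        f"and {self.potential.dtype}"
    )
if self.device != self.potential.device:
    raise ValueError(
        f"Potential and Calculator must have the same device, got {self.device} "
        f"and {self.potential.device}"
    )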

r"""
Find the optimal parameters for :class:`torchpme.EwaldCalculator`.

The error formulas are given `online
Contributor:

I think we don't need this here anymore. We give the equations in the documentation.

@GardevoirX GardevoirX (author):

Wouldn't it be nicer to let users know the origin of these equations in the documentation? Otherwise, if I were a newcomer, I might think you guys fabricated them 😂 Or should we move the references to where the equations are?

Comment on lines 101 to 114
Error bounds for :class:`torchpme.calculators.ewald.EwaldCalculator`.

The error formulas are given `online
<https://www2.icp.uni-stuttgart.de/~icp/mediawiki/images/4/4d/Script_Longrange_Interactions.pdf>`_
(currently not available, needs to be updated later). Note the difference in
notation between the parameters in the reference and ours:

.. math::

    \alpha &= \left( \sqrt{2}\,\mathrm{smearing} \right)^{-1}

    K &= \frac{2 \pi}{\mathrm{lr\_wavelength}}

    r_c &= \mathrm{cutoff}
Contributor:

This docstring seems to be very similar to the tuning function's.

Maybe write something like: this class implements the error bounds for the real-space and Fourier-space parts of the Ewald summation ...

src/torchpme/tuning/ewald.py — resolved (outdated)
src/torchpme/tuning/tuner.py — resolved (outdated)
src/torchpme/tuning/tuner.py — resolved
src/torchpme/tuning/tuner.py — resolved (outdated)
positions.requires_grad_(True)
cell.requires_grad_(True)
charges.requires_grad_(True)
execution_time -= time.time()
Contributor:

I think we should use time.monotonic(); it seems better suited for the timings we do here.

See: https://docs.python.org/3/library/time.html#time.monotonic
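Putting the two timing suggestions together (a temp variable plus time.monotonic()), the measurement could be wrapped roughly like this; the helper and its signature are a sketch, not the merged code:

import time
import torch

def time_one_pass(calculator, positions, cell, charges, run_backward=False):
    """Time one forward (and optionally backward) pass of a calculator-like callable."""
    t0 = time.monotonic()
    value = calculator(positions, cell, charges).sum()
    if run_backward:
        value.backward(retain_graph=True)
    if positions.device.type == "cuda":
        torch.cuda.synchronize()  # flush queued kernels before stopping the clock
    return time.monotonic() - t0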

Comment on lines 305 to 308
if self._run_backward:
positions.requires_grad_(True)
cell.requires_grad_(True)
charges.requires_grad_(True)
Contributor:

Are we sure this is necessary? Is the backward pass really that uncorrelated with the forward pass? Naively I would imagine that if the forward pass takes longer, the same holds for the backward pass, and since this does not include forces, I don't really see the point.

@PicoCentauri PicoCentauri mentioned this pull request Jan 22, 2025
@ceriottm (Contributor):

I'm done with a more-than-decent draft of the example. It explains well how the tuning is done, and then shows how to use the autotuner to also optimize the cutoff.

@PicoCentauri (Contributor):

Thanks, I'll take it from here.

@PicoCentauri PicoCentauri (Contributor) left a comment:

Thanks for this large change @GardevoirX and @ceriottm !

Comment on lines 251 to 254
smearing = torch.as_tensor(smearing)
mesh_spacing = torch.as_tensor(mesh_spacing)
cutoff = torch.as_tensor(cutoff)
interpolation_nodes = torch.as_tensor(interpolation_nodes)
Contributor:

Why are you using as_tensor instead of tensor?
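For context: torch.tensor always copies its input, while torch.as_tensor reuses the existing storage when the input is already a tensor with the right dtype and device (and it preserves autograd history). A quick check:

import torch

x = torch.ones(3)
print(torch.tensor(x).data_ptr() == x.data_ptr())     # False: torch.tensor copies
print(torch.as_tensor(x).data_ptr() == x.data_ptr())  # True: storage is reused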

... )

"""
_validate_parameters(charges, cell, positions, exponent)
Contributor:

Okay, makes sense.

@PicoCentauri PicoCentauri merged commit 9e8ece1 into main Jan 23, 2025
13 checks passed
@PicoCentauri PicoCentauri deleted the revamped-tuning branch January 23, 2025 07:51
Successfully merging this pull request may close these issues.

Grid-searching based tuning scheme
3 participants